A Bayesian Approach for the Clustering of Short Time Series
Author
Abstract
Recent technologies such as microarrays allow the expression levels of thousands of genes to be monitored over time. However, because gene expression series contain only a few time points, taking the temporal dependences into account when clustering the data is a hard task. Moreover, classes that are very interesting to the biologist, but that contain few genes relative to the rest of the data, can be completely missed by standard approaches. We propose a Bayesian approach to this problem. A mixture model is used to describe and classify the data. The parameters of this model are constrained by a prior distribution defined with a new type of model that expresses our prior knowledge. This knowledge makes it possible to take the temporal dependences into account in a natural way, and to express rough temporal profiles for the classes of interest.
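To make the general idea concrete, here is a minimal sketch of a mixture model whose class parameters are pulled toward rough temporal profiles by a prior. It is not the model of the paper: the abstract does not specify the "new type of model" used for the prior, so this sketch substitutes a plain isotropic Gaussian prior on each class mean and fits the mixture with MAP-EM. The function names (`log_gauss`, `map_em`), the fixed variances, the add-one smoothing of the class weights, and the toy data are all assumptions made for illustration.

```python
import math


def log_gauss(x, mu, var):
    """Log density of an isotropic Gaussian N(mu, var * I) at point x."""
    d = len(x)
    sq = sum((a - b) ** 2 for a, b in zip(x, mu))
    return -0.5 * (d * math.log(2 * math.pi * var) + sq / var)


def map_em(series, prior_means, var=0.25, prior_var=0.5, n_iter=50):
    """MAP-EM for a Gaussian mixture with a Gaussian prior on each class mean.

    prior_means[j] is a rough temporal profile for class j (assumed input).
    var and prior_var are kept fixed to keep the sketch short.
    """
    k, n, t = len(prior_means), len(series), len(series[0])
    means = [list(m) for m in prior_means]  # start from the prior profiles
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior class probabilities for every series.
        resp = []
        for x in series:
            logs = [math.log(weights[j]) + log_gauss(x, means[j], var)
                    for j in range(k)]
            top = max(logs)  # subtract the max for numerical stability
            probs = [math.exp(v - top) for v in logs]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: each mean is a precision-weighted average of data and prior.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            for s in range(t):
                data_term = sum(r[j] * x[s] for r, x in zip(resp, series)) / var
                prior_term = prior_means[j][s] / prior_var
                means[j][s] = (data_term + prior_term) / (nj / var + 1.0 / prior_var)
            # Add-one smoothing (MAP under a Dirichlet(2, ..., 2) prior) keeps
            # sparse classes from collapsing to zero weight.
            weights[j] = (nj + 1.0) / (n + k)
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return means, weights, labels


# Toy data: five flat series and one rising series (a sparse class).
series = [
    [0.1, -0.1, 0.0, 0.1],
    [0.0, 0.1, -0.1, 0.0],
    [-0.1, 0.0, 0.1, -0.1],
    [0.05, 0.0, -0.05, 0.1],
    [0.0, -0.05, 0.05, 0.0],
    [0.1, 0.9, 2.1, 2.9],
]
# Rough prior profiles: "flat" and "steadily rising".
prior_means = [[0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 2.0, 3.0]]

means, weights, labels = map_em(series, prior_means)
print(labels)
```

In this toy setting the prior profile for the rising class lets a single series form its own cluster instead of being absorbed by the dominant flat class, which is the kind of behaviour the abstract describes for sparse classes of interest.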
Similar resources
Combination of Transformed-means Clustering and Neural Networks for Short-Term Solar Radiation Forecasting
In order to provide an efficient conversion and utilization of solar power, solar radiation data should be measured continuously and accurately over the long-term period. However, the measurement of solar radiation is not available to all countries in the world due to some technical and fiscal limitations. Hence, several studies were proposed in the literature to find mathematical and physical mod...
Fuzzy clustering of time series data: A particle swarm optimization approach
With the rapid development of information gathering technologies and access to large amounts of data, methods for analyzing data and extracting useful information from large raw datasets are always required, and data mining is an important method for solving this problem. Clustering analysis, the most commonly used function of data mining, has attracted many researchers in computer science. Because o...
A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach
In recent years, the advancement of information gathering technologies such as GPS and GSM networks has led to huge, complex datasets such as time series and trajectories. As a result, it is essential to use appropriate methods to analyze the large raw datasets produced. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...
Missing data imputation in multivariable time series data
Multivariate time series data are found in a variety of fields such as bioinformatics, biology, genetics, astronomy, geography and finance. Many time series datasets contain missing data. Missing data imputation for multivariate time series is a challenging topic and needs to be carefully considered before learning from or predicting time series. Much research has been done on the use of diffe...
An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering
Multivariate time series (MTS) data are ubiquitous in science and daily life, and how to measure their similarity is a core part of the MTS analysis process. Much of the research effort in this context has focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...
Application of Bayesian spectral analysis in the analysis of photometric time series
The present paper introduces Bayesian spectral analysis as a powerful and efficient method for spectral analysis of photometric time series. For this purpose, Bayesian spectral analysis has been programmed in Matlab for the XZ Dra photometric time series, which is non-uniform with large gaps, and the power spectrum of this analysis has been compared with the power spectrum obtained from the ...
Journal title:
- Revue d'Intelligence Artificielle
Volume 20, Issue
Pages -
Publication date 2006